DPRml: distributed phylogeny reconstruction by maximum likelihood

Authors

  • Thomas M. Keane
  • Thomas J. Naughton
  • Simon A. A. Travers
  • James O. McInerney
  • Grace P. McCormack
Abstract

MOTIVATION: In recent years there has been increased interest in producing large and accurate phylogenetic trees using statistical approaches. However, for a large number of taxa, it is not feasible to construct such trees using only a single processor. A number of specialized parallel programs have been produced in an attempt to address the huge computational requirements of maximum likelihood (ML). We express a number of concerns about the current set of parallel phylogenetic programs, which severely limit the widespread availability and use of parallel computing in ML-based phylogenetic analysis.

RESULTS: We have identified the suitability of phylogenetic analysis to large-scale heterogeneous distributed computing. We have completed a distributed and fully cross-platform phylogenetic tree building program called distributed phylogeny reconstruction by maximum likelihood (DPRml). It uses an already proven ML-based tree building algorithm and a popular phylogenetic analysis library for all its likelihood calculations. It offers one of the most extensive sets of DNA substitution models currently available. We are the first, to our knowledge, to report the completion of a distributed phylogenetic tree building program that can achieve near-linear speedup while using only the idle clock cycles of machines. For those in an academic or corporate environment with hundreds of idle desktop machines, we have shown how distributed computing can deliver a 'free' ML supercomputer.

Similar resources

SPIML: a machine learning approach to phylogenomics

Sequencing of complete genomes from multiple related species within a single clade offers an opportunity to study phylogeny at the genome level. Within this context, the accurate reconstruction of the evolutionary histories of gene families has become a prominent albeit challenging goal. Previously, we have shown that the level of inaccuracy exhibited by current gene-tree reconstruction methods...

Multi-SpaM: a Maximum-Likelihood approach to Phylogeny reconstruction based on Multiple Spaced-Word Matches

Word-based or ‘alignment-free’ methods for phylogeny reconstruction are much faster than traditional approaches, but they are generally less accurate. Most of these methods calculate pairwise distances for a set of input sequences, for example from word frequencies or from so-called spaced-word matches. In this paper, we propose the first word-based approach to tree reconstruction that is based ...

Short Quartet Puzzling: A New Quartet-Based Phylogeny Reconstruction Algorithm

Quartet-based phylogeny reconstruction methods, such as Quartet Puzzling, were introduced in the hope that they might be competitive with maximum likelihood methods, without being as computationally intensive. However, despite the numerous quartet-based methods that have been developed, their performance in simulation has been disappointing. In particular, Ranwez and Gascuel, the developers of ...

A Practical Algorithm for Estimation of the Maximum Likelihood Ancestral Reconstruction Error

The ancestral sequence reconstruction problem asks to predict the DNA or protein sequence of an ancestral species, given the sequences of extant species. Such reconstructions are fundamental to comparative genomics, as they provide information about extant genomes and the process of evolution that gave rise to them. Arguably the best method for ancestral reconstruction is maximum likelihood est...

Maximum Likelihood 3D Reconstruction from One or More Images under Geometric Constraints

We address the 3D reconstruction of scenes in which some planarity, collinearity, symmetry and other geometric properties are known a priori. Our main contribution is a reconstruction method that has advantages of both constraint-based and model-based methods. As in the former, the reconstructed object need not be an assemblage of predefined shapes. As in the latter, the reconstruction is a...

Journal:
  • Bioinformatics

Volume 21, Issue 7

Pages: -

Publication date: 2005